AITopics | Atyrau

Qorgau: Evaluating LLM Safety in Kazakh-Russian Bilingual Contexts

Goloburda, Maiya, Laiyk, Nurkhan, Turmakhan, Diana, Wang, Yuxia, Togmanov, Mukhammed, Mansurov, Jonibek, Sametov, Askhat, Mukhituly, Nurdaulet, Wang, Minghan, Orel, Daniil, Mujahid, Zain Muhammad, Koto, Fajri, Baldwin, Timothy, Nakov, Preslav

arXiv.org Artificial Intelligence, Feb-19-2025

Large language models (LLMs) are known to have the potential to generate harmful content, posing risks to users. While significant progress has been made in developing taxonomies for LLM risks and safety evaluation prompts, most studies have focused on monolingual contexts, primarily in English. However, language- and region-specific risks in bilingual contexts are often overlooked, and core findings can diverge from those in monolingual settings. In this paper, we introduce Qorgau, a novel dataset specifically designed for safety evaluation in Kazakh and Russian, reflecting the unique bilingual context in Kazakhstan, where both Kazakh (a low-resource language) and Russian (a high-resource language) are spoken. Experiments with both multilingual and language-specific LLMs reveal notable differences in safety performance, emphasizing the need for tailored, region-specific datasets to ensure the responsible and safe deployment of LLMs in countries like Kazakhstan. Warning: this paper contains example data that may be offensive, harmful, or biased.

dataset, information, kazakhstan, (14 more...)

2502.1364

Country:

Europe > Russia (0.06)
Asia > Russia (0.06)
Asia > Thailand > Bangkok > Bangkok (0.04)
(5 more...)

Genre: Research Report (0.82)

Industry:

Government (1.00)
Law (0.94)
Media (0.69)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Enhancing Open-Domain Table Question Answering via Syntax- and Structure-aware Dense Retrieval

Jin, Nengzheng, Li, Dongfang, Chen, Junying, Siebert, Joanna, Chen, Qingcai

arXiv.org Artificial Intelligence, Sep-19-2023

Open-domain table question answering aims to answer a question by retrieving and extracting information from a large collection of tables. Existing studies of open-domain table QA either directly adopt text retrieval methods or consider the table structure only in the encoding layer for table retrieval, which may cause loss of syntactic and structural information during table scoring. To address this issue, we propose a syntax- and structure-aware retrieval method for the open-domain table QA task. It provides syntactic representations for the question and uses structural header and value representations for the tables to avoid the loss of fine-grained syntactic and structural information. Then, a syntactical-to-structural aggregator is used to obtain the matching score between the question and a candidate table by mimicking the human retrieval process. Experimental results show that our method achieves state-of-the-art results on the NQ-tables dataset and outperforms strong baselines by a large margin on a newly curated open-domain Text-to-SQL dataset.

computational linguistic, representation, retrieval, (14 more...)

2309.10506

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)
(14 more...)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.62)
